Detector and method for estimating data probability in a multi-channel receiver

ABSTRACT

A detector and method for estimating channel data probability in a multi-user or multiple-input multiple-output communication system includes summing conditional bit probabilities conditioned on hypothetical channel data patterns over stochastically selected hypothetical channel data patterns. Various detailed hardware structures and circuits are also described.

This application claims the benefit of U.S. Application Ser. No.60/586,360 filed on 7 Jul. 2004, entitled “Detector and method forestimating user data probability in a multi-user receiver,” which isherein incorporated by reference in it's entirety for all purposes.

This invention was made with government support under NSF Grant NoECS0121389 awarded by the National Science Foundation. The Governmenthas certain rights to this invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to wireless communication. Moreparticularly, the present invention relates to techniques for jointdetection of multiple-input multiple-output (MIMO) and multi-user codedivision multiple access (CDMA) signals.

2. Related Art

Wireless communications have become ubiquitous. Improving theperformance and capacity of wireless communications systems is highlydesirable.

Many wireless communications systems make use of code division multipleaccess (CDMA) to enable multiple users to share a common frequencybandwidth. CDMA can provide high capacity in certain wireless systems,for example cellular networks. In CDMA, users' transmissions are encodedwith spreading codes. Ideally, spreading codes for different users allowthe transmissions for different users to be separated withoutinterference. In practice, some interference (“multi-user interference”)occurs, which can be eliminated by various multi-user detectionalgorithms known in the art, including interference cancellation.Performance of interference cancellers has sometimes provided limitedperformance, in part due to errors in estimating the interference causedbetween users.

Improved interference cancellation techniques have been developed thatuse iterative processing. The typical iterative approach uses the outputof the interference canceller to estimate user data symbols, which arefed back into the interference canceller to provide improvedcancellation of interference and successively refined estimates of theuser symbols. This typical approach, however, suffers from variousproblems, including no guarantee of convergence, and suboptimalperformance. Other known approaches frequently realize only a fractionof the potential channel capacity.

An alternate approach to dealing with multi-user interference is toperform joint demodulation of the multiple users. For example, maximumlikelihood sequence estimation (MLSE) may be performed which accountsfor both multi-user interference and symbol-to-symbol memory introducedby forward error correction (FEC) encoding. Although MLSE can provideexcellent performance, MLSE is prohibitively complex when there are alarge numbers of users since the complexity of MLSE grows exponentiallywith the number of users. One method that avoids this exponentialcomplexity, yet results in the same performance as MLSE is minimum meansquare error (MMSE) detector with soft cancellation of X. Wang and H. V.Poor, “Iterative (Turbo) Soft Interference Cancellation and Decoding forCoded CDMA”, IEEE Trans. Commun., vol. 47, no. 7, pp. 1046-1061, July1999. A K user system may in general require inversion of a K-by-Kmatrix for each user symbol and for each iteration. Hence, theprocessing complexity of suboptimum receivers can be of the order of K³.Although some simplification of the processing can be obtained usingiterative steps, such approaches still require processing complexity ofthe order of K² or greater per each user symbol for each iteration.Hence industry has continued to search for improved multi-user detectiontechniques.

New wireless communications techniques, such as multiple-inputmultiple-output (MIMO) are also being introduced which have thepotential to provide high capacity. In MIMO, multiple antennas at atransmitter are used, where different symbols may be transmitted on eachantenna, providing increased capacity. Multiple antennas at the receiverare typically required, to allow the separation of the symbols from eachtransmit antenna. Theoretically, MIMO channels can provide capacitywhich increases linearly with the number of antennas, but like CDMA,interference between the different symbols from different antennas setslimits on practical application. Hence, handling interference becomes animportant aspect of achieving the potential capacity of a MIMO system.

SUMMARY OF THE INVENTION

It has been recognized that it would be advantageous to develop atechnique that provides good performance in the presence ofmulti-channel interference such as that present in CDMA and MIMOsystems.

The invention includes a method for estimating channel data bitprobabilities of a multi-channel signal received through a communicationchannel. The method may include summing a conditional bit probability toobtain a channel data bit probability summation. The conditional bitprobability may be conditioned on an observation of the multi-channelsignal and a hypothetical channel data pattern. The summation may beperformed over a subset of hypothetical channel data patterns. Themethod may include selecting the hypothetical channel data patternsusing a Markov chain Monte Carlo simulation to stochastically select thesubset of hypothetical channel data patterns. The hypothetical channeldata patterns may be selected to correspond to dominant conditional bitprobability terms in the channel data bit probability summation.

Additional features and advantages of the invention will be apparentfrom the detailed description which follows, taken in conjunction withthe accompanying drawings, which together illustrate, by way of example,features of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is flow chart of a method for estimating channel data bitprobabilities of a multi-channel signal received through a communicationchannel in accordance with an embodiment of the present invention;

FIG. 2 is a flow chart of an alternate method for estimating channeldata probabilities in a multi-channel communication system receiver inaccordance with an embodiment of the present invention;

FIG. 3 is transition diagram of a Markov chain for a three-channelsystem in accordance with an embodiment of the present invention;

FIG. 4 is a block diagram of a multi-channel receiver in accordance withan embodiment of the present invention;

FIG. 5 is a block diagram of a detector for estimating channel dataprobabilities in a multi-channel communication system in accordance withan embodiment of the present invention;

FIG. 6 is a block diagram of a Gibbs sampler circuit in accordance withan embodiment of the present invention;

FIG. 7 is a circuit diagram of a ψ-calculator circuit in accordance withan embodiment of the present invention;

FIG. 8 is a circuit diagram of a ξ-calculator circuit in accordance withan embodiment of the present invention; and

FIG. 9 is a circuit diagram of a log-likelihood calculator circuit inaccordance with an embodiment of the present invention.

DETAILED DESCRIPTION

Reference will now be made to the exemplary embodiments illustrated inthe drawings, and specific language will be used herein to describe thesame. It will nevertheless be understood that no limitation of the scopeof the invention is thereby intended. Alterations and furthermodifications of the inventive features illustrated herein, andadditional applications of the principles of the inventions asillustrated herein, which would occur to one skilled in the relevant artand having possession of this disclosure, are to be considered withinthe scope of the invention.

The following notations are adhered to throughout this application.Vectors are denoted by lowercase bold letters. Matrices are denoted byuppercase bold letters. The ij-th element of a matrix, say, A is denotedby a_(ij). The superscripts ^(T) and ^(H) are used to denote matrix orvector transpose and Hermitian, respectively. Integer subscripts areused to distinguish different channels. Integer time indices are put inbrackets. For example, d_(k)(i) is used to denote the ith symbol of thechannel k. In most expressions, the time index, i, is omitted forbrevity. The notations P(·) and p(·) denote the probability of adiscrete random variable and the probability density of a continuousrandom variable, respectively.

One aspect of the presently disclosed inventive techniques involvesrecognizing that a MIMO communication system resembles a CDMA system.The data streams transmitted at each MIMO antenna are thus analogous tousers in a CDMA system, and the channel gains between each transmitantenna and the receive antennas is a vector analogous to a spreadingcodes of a CDMA user. Accordingly, the number of receive antennas playsthe same role as the spreading gain, N, in a multi-user system, and thenumber of transmit antennas plays the same role as the number of users,K. Hence, the methods developed for multi-user detection can beimmediately applied to a MIMO system with K transmit and N receiveantennas. K/N can thus be referred to as the system load, and the MIMOchannel is thus under-loaded when K<N, fully loaded when K=N, andover-loaded when K>N. Accordingly, throughout this application the termmulti-channel will thus be used in a general sense to refer to both CDMAand MIMO systems. Within a multi-user CDMA system, for example whereindividual users are each encoded with a different spreading code, theterm channel refers to an individual user within a composite CDMAsignal. Within a MIMO system, for example where different data streamsare transmitted by transmit antennae, the term channel refers to aparticular data stream transmitted on one antenna; such a channel maycorrespond to one or more users. Of course, variations on these basictypes of systems will occur to one of skill in the art, and combinationsof CDMA and MIMO techniques are also known to which the disclosedinventive techniques are equally applicable.

A multi-channel CDMA or MIMO communication system with K channels willnow be described in general terms. For simplicity, it is assumed thatthe channels are synchronous, although this is not be essential. Eachchannel has a spreading sequences of length N, i.e., a spreading gain ofN. Using d(i)=[d₁(i) d₂(i) . . . d_(K)(i)]^(T) to denote data symbolstransmitted by the K channels in the i^(th) time slot and A(i) to denotethe matrix containing spreading code (scaled with the channel gain) ofthe k^(th) channel in its k^(th) column, the received signal vector, isgiven byy(i)=A(i)d(i)+n(i)   (1)where n(i) is the channel additive noise. The received signal may besampled at a chip rate, or another convenient rate as will occur to oneof skill in the art.

For simplicity, it is assumed in the derivation that the samples of n(i)are zero-mean, independent, and identically distributed andE[n(i)n^(H)(i)]=σ_(n) ²I. In practice, noise frequently deviatesslightly from this assumption without substantially degrading theperformance of communication systems. Data symbols are transmitted anddecoded in blocks of length M and for each block the time index i takesthe values of 1 to M. Various block sizes will be suitable dependingupon the communication system application.

As most receivers include forward error correction decoding, an estimateof the data log-likelihood ratio (LLR) is used, given by

$\begin{matrix}{{{\lambda_{1}\left( {d_{k}(i)} \right)} = {\ln\;\frac{P\left( {{{d_{k}(i)} = {{+ 1}❘{y(i)}}},{\lambda_{2}^{e}(i)}} \right)}{P\left( {{{d_{k}(i)} = {{- 1}❘{y(i)}}},{\lambda_{2}^{e}(i)}} \right)}}},} & (2)\end{matrix}$where λ₂ ^(e)(i) is extrinsic information provided by a forward errorcorrection decoder, when present, as discussed further below. Estimatingthe values of λ₁(d_(k)(i)) in a computationally efficient manner is asignificant benefit of the presently disclosed inventive techniques. Forexample, note that

$\begin{matrix}\begin{matrix}{P\left( {{d_{k}(i)} = {{+ 1}❘{y(i)}}} \right)} \\{= {\sum\limits_{d_{- k}{(i)}}{P\left( {{{d_{k}(i)} = {+ 1}},{{d_{- k}(i)}❘{y(i)}}} \right)}}} \\{{= {\sum\limits_{d_{- k}{(i)}}{{P\left( {{{d_{k}(i)} = {{+ 1}❘{y(i)}}},{d_{- k}(i)}} \right)}{P\left( {{d_{- k}(i)}❘{y(i)}} \right)}}}},}\end{matrix} & (4)\end{matrix}$where the second identity follows by applying the chain rule,d_(−k)(i)=[d₁(i) . . . d_(k−1)(i) d_(k+1)(i) . . . d_(K)(i)]^(T) and thesummation is over all possible values of d_(−k)(i) . The number ofcombinations that d_(−k)(i) can take grows exponentially with K and thusbecomes prohibitive for large values of K.

A first method which can avoid this prohibitive complexity will now bedescribed. FIG. 1 illustrates a flow chart of a method, shown generallyat 100, for estimating a channel data bit probability, e.g.P(d_(k)(i)=+1|y(i)) , of a multi-channel signal received through acommunication channel. The method includes summing 102 a conditional bitprobability, e.g. P(d_(k)(i)=+1|y(i), d_(−k)(i)), conditioned on anobservation of the multi-channel signal, e.g. y(i), and a hypotheticaldata pattern, e.g. d_(−k)(i) to obtain a channel data bit probabilitysummation. Rather than summing the conditional bit probability over allpossible data patterns, the summation is performed over a subset of thehypothetical data patterns, e.g. d_(−k)(i) . Hence, the method includesselecting 104 the subset of data patterns corresponding to dominantconditional bit probability terms in the channel data bit probabilitysummation. In other words, the summation is performed substantially overterms where the conditional probability is significant, and terms wherethe conditional probability is significant are substantially omitted.

The selection of hypothetical channel data patterns can be performedusing a Markov chain Monte Carlo (MCMC) simulation to stochasticallyselect the subset of data patterns corresponding to dominant conditionalbit probability terms in the channel data bit probability summation. Bysumming a subset of data patterns corresponding to the dominant terms,rather than summing over all possible data values, a significantreduction in the processing can be obtained relative to an exactcalculation of the channel data bit probability summation. The MCMCsimulation helps to ensure that the subset of data patterns selectedcorrespond to the dominant terms. Dominant terms are those terms in thesummation that are relatively large and thus have correspondinglygreater impact on the summation than terms that are relatively small. Asignificant benefit of the method is the ability to obtain an accurateestimate of the channel data bit probability from a relatively smallsubset of hypothetical channel data patterns, by including primarilydominant terms in the summation. An additional benefit of the method isimproved accuracy of the channel data probability estimates relative tosome prior art approaches which do not perform probability summations.

Of course, because the conditional probability summation can benormalized, it is not essential that all of the dominant terms areincluded, or that all non-dominant terms are excluded, in order toobtain a good estimate of the channel data bit probability. The MCMCsimulation, being a stochastic process, will not precisely select allthe largest terms and exclude all the smallest terms. Instead, the MCMCsimulation helps to ensure that most of the hypothetical data patternsincluded in the summation are dominant terms.

In order to describe in further detail how the MCMC simulation may beused to select dominant conditional bit probability terms, a generaldescription of Bayesian estimation using Monte Carlo integration will beprovided. Consider the problem of evaluating the weighted mean of afunction h(x) of a random variable X, given a weighting function ƒ(x),viz.

$\begin{matrix}{{{E_{f}\left\lbrack {h(X)} \right\rbrack} = {\int_{\chi}{{h(x)}{f(x)}{\mathbb{d}x}}}},} & (5)\end{matrix}$where χ is the domain of X and ƒ(·) is a proper density function, i.e.,ƒ(x)≧0 for x∈χ and

∫_(χ)f(x)𝕕x = 1.An estimate of (5) can be obtained by evaluating the empirical average

$\begin{matrix}{{\overset{\_}{h} = {\frac{1}{N_{s}}{\sum\limits_{n = 1}^{N_{s}}{h\left( x_{n} \right)}}}},} & (6)\end{matrix}$where x_(n)'s are samples from the distribution ƒ(x) . Moreover, thespeed of convergence of the method can be evaluated by estimating thevariance of h which can be obtained by

$\begin{matrix}{\sigma_{\overset{\_}{h}}^{2} = {\frac{1}{N_{s}\left( {N_{s} - 1} \right)}{\sum\limits_{n = 1}^{N_{s}}{{{{h\left( x_{n} \right)} - \overset{\_}{h}}}^{2}.}}}} & (7)\end{matrix}$Here, x may be a scalar or a vector variable. When x is a vector, theintegral in (5) is a multiple integral whose direct computation maybecome prohibitive as the dimension of x increases. From (7), theaccuracy of the estimate (6) reduces with the square of the number ofsample points. Moreover, numerical studies show that the dimension of xand the size N_(s) are very weakly related in the sense that even thoughthe dimension of x may increase, the order of magnitude of N_(s) remainsunchanged. Hence, by using a MCMC simulation to select hypothetical datapatterns in the channel data probability summation, the exponentialgrowth of computational complexity that is commonly encountered inperforming multiple integrals may be avoided.

The Monte Carlo integration include importance sampling. In the methodof importance sampling E_(f)[h(X)] is evaluated by performing theempirical average

$\begin{matrix}{{{E_{f}\left\lbrack {h(X)} \right\rbrack} \approx {\frac{1}{N_{s}}{\sum\limits_{n = 1}^{N_{s}}{\frac{f\left( x_{n} \right)}{f_{a}\left( x_{n} \right)}{h\left( x_{n} \right)}}}}},} & (8)\end{matrix}$where the samples x_(n) are chosen from the auxiliary distributionƒ_(a)(x). Equation (8) follows from the alternative representation of(5)

$\begin{matrix}{{E_{f}\left\lbrack {h(X)} \right\rbrack} = {\int_{X}{{h(x)}\frac{f(x)}{f_{a}(x)}{f_{a}(x)}{\mathbb{d}x}}}} & (9)\end{matrix}$and performing the integral using Monte Carlo integration based on thedistribution ƒ_(a)(x). Similar to ƒ(x), ƒ_(a)(x) is preferably a properdensity function, i.e., it satisfies the conditions ƒ_(a)(x)≧0 for x ∈ Xand

∫_(X)f_(a)(x)𝕕x = 1.When ƒ_(a)(x)=ƒ(x), (8) reduces to regular Monte Carlo integration.

It turns out that, for a fixed value of N_(s), many choices of ƒ_(a)(x)that are different from ƒ(x) may result in more accurate evaluation ofE_(ƒ)[h(x)]. In particular, one auxiliary distribution ƒ_(a)(x) thatallows accurate evaluation of E_(ƒ)[h(x)] with minimum number of samplesis

f_(a, o)(x) = h(x)f(x)/∫_(X)h(x)f(x)𝕕x.When available, this auxiliary distribution allows exact evaluation ofE_(ƒ)[h(x)] by using a single sample. However, ƒ_(a,o)(x) depends on theintegral

∫_(X)h(x)f(x)𝕕xwhich may not be available. In fact, when h(x)>0 for x ∈ X, the latteris the integral seeking to be solved. A practical approximation that hasbeen motivated from importance sampling which usually results in abetter approximation than (8) is

$\begin{matrix}{{\overset{\_}{h} = \frac{\sum\limits_{n = 1}^{N_{s}}{\frac{f\left( x_{n} \right)}{f_{a}\left( x_{n} \right)}{h\left( x_{n} \right)}}}{\sum\limits_{n = 1}^{N_{s}}\frac{f\left( x_{n} \right)}{f_{a}\left( x_{n} \right)}}},} & (10)\end{matrix}$where here also, x_(n) is chosen from the distribution ƒ_(a)(x). Aspecial case of (10), of particular interest, is obtained when ƒ_(a)(x)is uniform over a selected range of x. In order to minimize N_(s), thisrange is chosen such that it covers the values of x for which ƒ(x)h(x)is significant. In other words, x is limited to the important samplesthat contribute to the desired integral. When ƒ_(a)(x) is uniform, (10)simplifies to

$\begin{matrix}{\overset{\_}{h} = {\frac{\sum\limits_{n = 1}^{N_{s}}{{f\left( x_{n} \right)}{h\left( x_{n} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{f\left( x_{n} \right)}}.}} & (11)\end{matrix}$Other auxiliary distribution can be used as well, as discussed furtherherein. For a general description of Monte Carlo methods, see G.Fishman, Monte Carlo: concepts, algorithms and applications, New York:Springer-Verlag, 1996 and C. P. Robert and G. Casella, Monte CarloStatistical Methods: Springer-Verlag, New York, 1999.

So, to perform the summation of (4), the MCMC simulation is used toselect a subset of terms. For simplicity in notation, the time index ‘i’is dropped from all the involved variables, e.g., P(d_(k)|y) is ashort-hand notation for P(d_(k)(i)|y(i)).

As discussed further below, it may be desirable to incorporateadditional information into the conditional bit probability. Forexample, extrinsic information from a forward error correction decodermay be used to improve the LLR. If such extrinsic information is givenby λ₂ ^(e), the channel data bit probability summation can be expressedas

$\begin{matrix}\begin{matrix}{P\left( {{{d_{k}(i)} = {{+ 1}❘{y(i)}}},{\lambda_{2}^{e}(i)}} \right)} \\{= {\sum\limits_{d_{- k}{(i)}}{P\left( {{{d_{k}(i)} = {+ 1}},{{d_{- k}(i)}❘{y(i)}},{\lambda_{2}^{e}(i)}} \right)}}} \\{{= {\sum\limits_{d_{- k}{(i)}}{{P\left( {{{d_{k}(i)} = {{+ 1}❘{y(i)}}},{d_{- k}(i)},{\lambda_{2}^{e}(i)}} \right)}{P\left( {{{d_{- k}(i)}❘{y(i)}},{\lambda_{2}^{e}(i)}} \right)}}}},}\end{matrix} & (12)\end{matrix}$This formulation will be used in the following derivations, although oneof skill in the art will understand that the presently disclosedinventive techniques can also be applied in the absence of extrinsicinformation from a forward error correction decoder.

In the subsequent equations the time index “i” is not included forbrevity of the presentation.

To obtain a summation similar to (6) for computation of P(d_(k)=+1|y,λ₂^(e)), P(d_(−k)|y,λ₂ ^(e)) is used as the density function, ƒ(x), andP(d_(k)=+1|y,d_(−k),λ₂ ^(e)) is used as the function whose weighted sumis to be obtained, h(x) . An estimate of P(d_(k)=+1|y, λ₂ ^(e)) is thusobtained by evaluating the empirical average

$\begin{matrix}{{{P\left( {{d_{k} = {{+ 1}❘y}},\lambda_{2}^{e}} \right)} \approx {\frac{1}{N_{s}}{\sum\limits_{n = 1}^{N_{s}}{P\left( {{d_{k} = {{+ 1}❘y}},d_{- k}^{(n)},\lambda_{2}^{e}} \right)}}}},} & (13)\end{matrix}$where d_(−k) ^((n)) are the samples that are chosen from thedistribution P(d_(−k)|y,λ₂ ^(e)), and, when appearing in equations like(13), d_(−k) ^((n)) is the shorthand notation for d_(−k)=d_(−k) ^((n)).

Starting with (11), treat P(d_(−k)|y,λ₂ ^(e)) as ƒ(x), and P(d_(k)=+1|y,d_(−k),λ₂ ^(e)) as h(x). This gives

$\begin{matrix}{{{P\left( {{d_{k} = {{+ 1}❘y}},\lambda_{2}^{e}} \right)} \approx \frac{\sum\limits_{n = 1}^{N_{s}}{{P\left( {{d_{k} = {{+ 1}❘y}},d_{- k}^{(n)},\lambda_{2}^{e}} \right)}{P\left( {{d_{- k}^{(n)}❘y},\lambda_{2}^{e}} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{P\left( {{d_{- k}^{(n)}❘y},\lambda_{2}^{e}} \right)}}},} & (14)\end{matrix}$where the samples d_(−k) ^((n)) are chosen from a uniform distribution.

Equations (13) and (14) thus present alternate approaches to estimatingthe channel data bit probabilities; equation (13) using an auxiliarydistribution set equal to a conditional channel data bit probability,and equation (14) using a uniform distribution. Various auxiliarydistributions may be chosen, as will become apparent to one of skill inthe art and having possession of this disclosure.

In accordance with one embodiment of the present invention, using thecomputation based on (13), P(d_(k)=+1|y,d_(−k) ^((n)),λ₂ ^(e)), isevaluated for n=1,2, . . . ,N_(s). For this, the LLR is defined as

$\begin{matrix}{{\lambda_{1}^{(n)}\left( d_{k} \right)} = {\ln\frac{\;{P\left( {{d_{k} = {{+ 1}❘y}},d_{- k}^{(n)},\lambda_{2}^{e}} \right)}}{\;{P\left( {{d_{k} = {{- 1}❘y}},d_{- k}^{(n)},\lambda_{2}^{e}} \right)}}}} & (16)\end{matrix}$and can be expanded as

$\begin{matrix}\begin{matrix}{{\lambda_{1}^{(n)}\left( d_{k} \right)} = {\ln\;\frac{p\left( {{y❘d_{- k}^{(n)}},{d_{k} = {+ 1}}} \right){P\left( {d_{- k}^{(n)},{d_{k} = {{+ 1}❘\lambda_{2}^{e}}}} \right)}}{p\left( {{y❘d_{- k}^{(n)}},{d_{k} = {- 1}}} \right){P\left( {d_{- k}^{(n)},{d_{k} = {{- 1}❘\lambda_{2}^{e}}}} \right)}}}} \\{= {{\ln\frac{p\left( {{y❘d_{- k}^{(n)}},{d_{k} = {+ 1}}} \right)}{p\left( {{y❘d_{- k}^{(n)}},{d_{k} = {- 1}}} \right)}} + {\lambda_{2}^{e}\left( d_{k} \right)}}} \\{= {{\frac{1}{2\sigma_{n}^{2}}\left( {{{y - {A_{- k}d_{- k}} + a_{k}}}^{2} - {{y - {A_{- k}d_{- k}} - a_{k}}}^{2}} \right)} +}} \\{\lambda_{2}^{e}\left( d_{k} \right)} \\{= {{\frac{2}{\sigma_{n}^{2}}\Re\left\{ {a_{k}^{H}\left( {y - {A_{- k}d_{- k}}} \right)} \right\}} + {\lambda_{2}^{e}\left( d_{k} \right)}}} \\{= {{\frac{2}{\sigma_{n}^{2}}\Re\left\{ {y_{k}^{MF} - {\sum\limits_{\underset{l \neq k}{l = 0}}^{K}{\rho_{kl}d_{l}}}} \right\}} + {\lambda_{2}^{e}\left( d_{k} \right)}}}\end{matrix} & (17)\end{matrix}$where

{·} denotes the real part, A_(−k) is A with its kth column, a_(k),removed, y_(k) ^(MF)=a_(k) ^(H)y is the matched filter output for thek^(th) channel and ρ_(kl)=a_(k) ^(H)a_(l) is the crosscorrelationbetween channels k and l. The simplification of the second line followswhen the extrinsic information provided by each element of λ₂ ^(e) isindependent of those provided by its other elements, for example whenthere is interleaving as discussed below. If the elements of d areindependent of one another this implies that

$\begin{matrix}\begin{matrix}{{\ln\;\frac{P\left( {d_{- k}^{(n)},{d_{k} = {{+ 1}❘\lambda_{2}^{e}}}} \right)}{P\left( {d_{- k}^{(n)},{d_{k} = {{- 1}❘\lambda_{2}^{e}}}} \right)}} = {{\ln\frac{P\left( {d_{- k}^{(n)}❘\lambda_{2,{- k}}^{e}} \right)}{P\left( {d_{- k}^{(n)}❘\lambda_{2,{- k}}^{e}} \right)}} +}} \\{\ln\;\frac{P\left( {d_{k} = {{+ 1}❘{\lambda_{2}^{e}\left( d_{k} \right)}}} \right)}{P\left( {d_{k} = {{- 1}❘{\lambda_{2}^{e}\left( d_{k} \right)}}} \right)}} \\{= {\ln\;\frac{P\left( {d_{k} = {{+ 1}❘{\lambda_{2}^{e}\left( d_{k} \right)}}} \right)}{P\left( {d_{k} = {{- 1}❘{\lambda_{2}^{e}\left( d_{k} \right)}}} \right)}}} \\{{= {\lambda_{2}^{e}\left( d_{k} \right)}},}\end{matrix} & (18)\end{matrix}$where λ_(2,−k) ^(e) is λ₂ ^(e) with λ₂ ^(e)(d_(k)) dropped from it andthe last identity follows from the definition of L-value. The third linein (17) follows since

${{p\left( {y❘d} \right)} = {\frac{1}{\left( {2{\pi\sigma}_{n}^{2}} \right)^{N/2}}{\mathbb{e}}^{{{- {{y - {Ad}}}^{2}}/2}\sigma_{n}^{2}}}},$where |·| denotes the length of a vector. Once λ₁ ^((n))(d_(k)) isobtained, recalling that P(d_(k)=−1|y,d_(−k) ^((n)),λ₂^(e))=1−P(d_(k)=+1|y,d_(−k) ^((n)), λ₂ ^(e)) and solving (16) forP(d_(k)=+1|y,d_(−k) ^((n)),λ₂ ^(e)),

$\begin{matrix}{{P\;\left( {{d_{k} = \left. {+ 1} \middle| y \right.},d_{- k}^{(n)},\lambda_{2}^{e}} \right)} = {\frac{1}{1 + {\exp\left( {- {\lambda_{1}^{(n)}\left( d_{k} \right)}} \right)}}.}} & (19)\end{matrix}$This is used in (13) for computation P(d_(k)=+1|y,λ₂ ^(e)). Next, formthe LLR by calculating

${\lambda_{1}\left( d_{k} \right)} = {\frac{P\left( {{d_{k} = \left. {+ 1} \middle| y \right.},\lambda_{2}^{e}} \right)}{1 - {P\left( {{d_{k} = \left. {+ 1} \middle| y \right.},\lambda_{2}^{e}} \right)}}.}$If desired, extrinsic LLR information λ₁ ^(e)(d_(k)) may be derived fromλ₁ ^(e)(d _(k))=λ₁(d _(k))−λ₂ ^(e)(d _(k)).   (15)For example, extrinsic LLR information may be exchanged with a forwarderror correction decoder as discussed further below.

In accordance with another embodiment of the present invention, usingcomputation based on (14), the LLR can be estimated from

$\begin{matrix}{{\lambda_{1}\left( d_{k} \right)} = {\ln\;{\frac{P\left( {{d_{k} = \left. {+ 1} \middle| y \right.},\lambda_{2}^{e}} \right)}{P\;\left( {{d_{k} = \left. {- 1} \middle| y \right.},\lambda_{2}^{e}} \right)}.}}} & (20)\end{matrix}$Substituting (14) and its dual when d_(k=−)1 in (20), yields

$\begin{matrix}{{\lambda_{1}\left( d_{k} \right)} = {\ln\;{\frac{\sum\limits_{n = 1}^{N_{s}}{P\;\left( {{d_{k} = \left. {+ 1} \middle| y \right.},d_{- k}^{(n)},\lambda_{2}^{e}} \right)\;{P\left( {\left. d_{- k}^{(n)} \middle| y \right.,\lambda_{2}^{e}} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{{P\left( {{d_{k} = \left. {- 1} \middle| y \right.},d_{- k}^{(n)},\lambda_{2}^{e}} \right)}\;{P\left( {\left. d_{- k}^{(n)} \middle| y \right.,\lambda_{2}^{e}} \right)}}}.}}} & (21)\end{matrix}$Using the Bayes rule,

$\begin{matrix}\begin{matrix}{{P\left( {\left. d_{- k}^{(n)} \middle| y \right.,\lambda_{2}^{e}} \right)} = \frac{p\left( {d_{- k}^{(n)},\left. y \middle| \lambda_{2}^{e} \right.} \right)}{p\left( y \middle| \lambda_{2}^{e} \right)}} \\{= \frac{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,\lambda_{2}^{e}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}{p\left( y \middle| \lambda_{2}^{e} \right)}}\end{matrix} & (22)\end{matrix}$where P^(e)(d_(−k) ^((n))) is short-hand notation for P(d_(−k) ^((n))|λ₂^(e)), i.e., the probability of d_(−k)=d_(−k) ^((n)) given the availableextrinsic information. Similarly,

$\begin{matrix}{{P\left( {{d_{k} = \left. {+ 1} \middle| y \right.},d_{- k}^{(n)},\lambda_{2}^{e}} \right)} = {\frac{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {+ 1}}} \right)}\;{P^{e}\left( {d_{k} = {+ 1}} \right)}}{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,\lambda_{2}^{e}} \right)}.}} & (23)\end{matrix}$Substituting (23), the dual of (23) with d_(k)=+1 replaced by d_(k)=−1,and (22) in (21),

$\begin{matrix}\begin{matrix}{{\lambda_{1}\left( d_{k} \right)} = {\ln\;{\frac{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {+ 1}}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {- 1}}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}} \cdot \frac{P^{e}\left( {d_{k} = {+ 1}} \right)}{P^{e}\left( {d_{k} = {- 1}} \right)}}}} \\{= {{\ln\;\frac{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {+ 1}}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {- 1}}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}}} + {{\lambda_{2}^{e}\left( d_{k} \right)}.}}}\end{matrix} & (24)\end{matrix}$Recalling that λ₁ ^(e)(d_(k))=λ₁(d_(k))−λ₂ ^(e)(d_(k)), from (24),

$\begin{matrix}{{\lambda_{1}^{e}\left( d_{k} \right)} = {\ln\;{\frac{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {+ 1}}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| d_{- k}^{(n)} \right.,{d_{k} = {- 1}}} \right)}\;{P^{e}\left( d_{- k}^{(n)} \right)}}}.}}} & (25)\end{matrix}$

Considering the MCMC simulation in further detail, a Markov chain can beconstructed based on the channel model (1) and the a posterioriprobability P(d|y,λ₂ ^(e)) . It can be shown that this Markov chain isan irreducible, aperiodic and reversible chain. It thus follows that,upon convergence of the chain, its stationary distribution matches thatof P(d|y,λ₂ ^(e)). Hence, the method may include constructing a MCMCsimulation whose states are determined by different selections of thedata vector d=[d₁ d₂ . . . d_(K)]^(T). For example, in a system withthree channels and binary data, the state assignments may be chosen asshown in TABLE 1. Each state transition occurs as a result of flippingone of the bits in d. The selection of a bit is random and has equalprobability for all the bits in d. The probability of transition isbased on the a posteriori probability P(d_(k)=+1|y,d_(−k),λ₂ ^(e)). Inother words, the transition probability can be set to correspond toconditional channel data bit probabilities, for example based on a modelof the communication channel. Considering the three channel example ofTABLE 1, if the current state is s₀ and d₂ is chosen for examination,the next state will be either s₀ or s₂. The decision to remain in s₀ ormove to s₂ is made according to the following rule. (i) Calculate the aposteriori probability p⁺=P(d₂=+1|y,d₁=−1,d₃=−1,λ₂ ^(e)(d₂)). (ii) Moveto state s₂ with the probability p⁺, otherwise remain in state s₀.

TABLE 1 Exemplary State Assignments for a System of Three Binary DataChannels State d₃ d₂ d₁ s₀ −1 −1 −1 s₁ −1 −1 +1 s₂ −1 +1 −1 s₃ −1 +1 +1s₄ +1 −1 −1 s₅ +1 −1 +1 s₆ +1 +1 −1 s₇ +1 +1 +1

FIG. 3 presents a transition graph of this Markov chain. Possibletransition between the states are indicated by arrows. These arrows arecalled edges. Presence of an edge indicates possible transition with anon-zero probability. If the probability of transition from one state toanother is zero, no edge will connect the two states.

Note that, as channel noise decreases, the probability of sometransitions may diminish to zero and others may approach one. Thismeans, in the absence of the channel noise, some of the edges will beabsent, and thus most of the useful properties of the underlying Markovchain may disappear. A practical consequence of this is that the MCMCsimulation may exhibit a modal behavior: tendency to stay in certainstates for long periods of time, failing to provide a good sampling.Modal behavior may be observed in a MCMC simulator at high signal tonoise ratio, for example above 8 dB, in a MIMO system with four transmitantennas and quadrature signaling. In this situation, the Markov chainmay have a tendency to spend long periods of time in the same state,causing insufficient variation in the hypothetical channel data samples,in turn causing poor estimates of the channel data probabilities and loglikelihood ratios.

This modal behavior can therefore be avoided by running the MCMCsimulation as though the signal to noise ratio of the received signal islower than the actual signal to noise ratio in order to ensure goodstatistical properties from the MCMC simulation. For example, theauxiliary distribution may be chosen to be a conditional channel dataprobability estimated at a low signal to noise ratio. Use of anauxiliary distribution corresponding to a low signal to noise ratio, forexample below 8 dB, or more particularly below 5 dB, avoids modalbehavior of the MCMC simulator helps to prevent this problem byproviding more perturbation in the MCMC simulator. Estimation of thechannel data probability for higher signal to noise ratio isaccomplished by scaling the calculated bit probability distribution toaccount for the auxiliary distribution. In other words, the correctvalue of the noise variance is used in calculating the probabilities in(25). Computer simulations reveal that this approach performs well,especially in highly loaded cases when K>2N.

A few observations from FIG. 3 are in order; these generalizations wouldbe understood by one of skill in the art to apply to a system with anarbitrary number of channels:

-   -   For an observed vector y and given extrinsic information vector        λ₂ ^(e), the a posteriori probabilities P(d_(k)=+1|y,d_(−k),λ₂        ^(e)) and P(d_(k)=−1|y,d_(−k),λ₂ ^(e)) and, hence, the elements        of the transition matrix for the Markov chain are fixed. This        implies that the Markov chain is homogeneous.    -   Any arbitrary pairs of states s_(i) and s_(j) are connected        through a sequence of edges. This means transition between any        arbitrary pairs of states s_(i) and s_(j) is possible. This in        turn implies that the multi-channel Markov chain is irreducible.    -   There is an edge connecting each state to itself. In other        words, for any s_(i), the probability of staying in the state is        non zero, i.e. π_(ii)>0. This implies that the multi-channel        Markov chain is aperiodic.

Since the multi-channel Markov chain is homogeneous, irreducible andaperiodic, it can be proven that it converges to a unique stationarydistribution.

Note that the a posteriori probability P(d|y,λ₂ ^(e)) is the stationarydistribution of the multi-channel Markov chain: the probability that themulti-channel Markov chain is in the state associated with d. It can beshown that P(d|y, λ₂ ^(e)) satisfies the reversibility rule, and thus,if the multi-channel Markov chain is run for sufficient number ofiterations, it converges toward the desired distribution P(d|y,λ₂ ^(e)).

In accordance with an embodiment of the invention, each step of the MCMCsimulation may begin with a random selection of one of the statevariable, d_(k). The next state is then determined according to theprocedure described above. In accordance with an alternate embodiment ofthe invention, the state variables may be scheduled such that they eachget turn on a regular basis. For example, one possible version of Gibbssampler for the multi-channel detector is summarized as follows:

-   -   Initialize d^((−N) ^(b) ⁾ (randomly),    -   for n=−N_(b)+1 to N_(s)    -   draw sample d₁ ^((n)) from P(d₁|d₂ ^((n−1)), . . . , d_(K)        ^((n−1)),y,λ₂ ^(e))    -   draw sample d₂ ^((n)) from P(d₂|d₁ ^((n)),d₃ ^((n−1)), . . . ,        d_(k) ^((n−1)),y,λ₂ ^(e))    -   draw sample d₂ ^((n)) from P(d_(K|d) ₁ ^((n)), . . . , d_(k−1)        ^((n)),y,λ₂ ^(e))        In this routine d^((−N) ^(b) ⁾ is initialized randomly,        optionally taking into account the a priori information λ₂ ^(e).        The ‘for’ loop examines the state variables, d_(k)'s, in order,        N_(b)+N_(s) times. The first N_(b) iterations of the loop,        called burn-in period, allows the Markov chain to converge to        near its stationary distribution. The samples used for LLR        computations are those of the last N_(s) iterations.

Alternately, in accordance with another embodiment of the presentinvention, a number of Gibbs samplers can be run in parallel. Thisintroduces parallelism in the detector implementation, which in turnallows drawing more Gibbs samples. Such an approach may prove usefulwhen available processing speed is a practical limitation. Moreover, thesamples drawn in successive iterations from a single Markov chain arecorrelated. In contrast, the parallel Gibbs samplers draw a pool ofsamples which are less correlated; generally only the samples thatbelong to the same Markov chain are correlated. Although the parallelGibbs samplers may be inefficient if a large burn-in period, N_(b), isused, the parallel Gibbs samplers can also be run with no burn-in periodat all, in accordance with yet another embodiment of the presentinvention. Therefore, the choice between single and parallel Markovchains is application dependent and even for a particular applicationthis choice may vary as the details of the detector implementation varyas will occur to one of skill in the art in possession of thisdisclosure.

An alternate method for estimating channel data probabilities in amulti-channel communications system receiver is illustrated in FIG. 2 inaccordance with an embodiment of the present invention. The method,indicated generally at 200, may include receiving 202 a multi-channelcommunications signal to form an observed signal. Various techniques forreceiving a multi-channel communications signal are known in the art asdiscussed within, and including for example a down-converting receiver.Another step in the method may include forming 204 a multi-channel dataprobability estimate from the observed signal. For example, forming achannel data probability estimate may be performed by soft decisionestimation in a multi-channel detector. The method may also includesimulating 206 a channel data distribution according to an auxiliarydistribution using a Markov chain Monte Carlo simulator. For example,simulating a channel data distribution may be performed as discussedabove. As discussed above, the auxiliary distribution may correspond toa channel data distribution determined for a low signal to noise ratio,or the auxiliary distribution may be correspond to a uniformdistribution.

A final step in the method may include refining 208 the multi-channeldata probability estimate using the channel data distribution. Forexample, refining the multi-channel data probability estimate may beperformed as discussed above. Refining the multi-channel dataprobability estimate may include exchanging extrinsic information with aforward error correction decoder as discussed further below.

As discussed above, reception of a multi-channel signal may involveexchanging information between a detector and a decoder. For example,FIG. 4 illustrates a multi-channel receiver, shown generally at 400, inaccordance with an embodiment of the present invention, which includes adetector 402 and a decoder 404 in a turbo decoding loop. The detectorprovides a set of soft output sequences λ₁ (LLRs) 412 for theinformation sequences d₁(i), d₂(i) , . . . , d_(K)(i), based on theobserved input vector sequence y(i) 410 and the extrinsic information (apriori soft information) 418 from the previous iteration of the decoder.After subtracting the extrinsic information from the LLR output 412 ofdetector, the extrinsic information λ₁ ^(e) 414 (remaining informationnew to the decoder) is passed to the decoder for further processing.Similarly, the extrinsic information (soft input) to each FEC decoder issubtracted from the decoder output λ₂ 416 to generate the extrinsicinformation λ₂ ^(e) 418 before being fed back to the detector.Typically, the turbo decoding loop includes an interleaver 408 anddeinterleaver 406 between the detector and decoder. Multiple iterationsof exchanging extrinsic information between the detector and decoder maybe used to refine estimates of the channel data. It has also been foundexperimentally that, performance may be enhanced by running the MCMCsimulation at a signal to noise ratio below the estimated signal tonoise ratio of the received signal and gradually increasing the signalto noise ratio used by the MCMC simulation as the detector and decoderiterate.

For d_(k)(i), λ₁(d_(k)(i)) is used to denote the soft output of SISOmulti-channel detector, λ₂(d_(k)(i)) to denote the soft output of theFEC decoder, and λ₁ ^(e)(d_(k)(i)) and λ₂ ^(e)(d_(k)(i)) to denote thecorresponding extrinsic information, respectively. Moreover, the vectorsλ₂ ^(e)(i)=[λ₂ ^(e)(d₁(i))λ₂ ^(e)(d₂(i)) , . . . ,λ₂ ^(e)(d_(k)(i))]^(T)and λ₁ ^(e)(k)=[λ₁ ^(e)(d_(k)(1))λ₁ ^(e)(d_(k)(2)) , . . . ,λ₁^(e)(d_(k)(M))]^(T) are defined. Note that λ₂ ^(e)(i) denotes theextrinsic information of all channels at time slot i, while λ₁ ^(e)(k)denotes the extrinsic information of channel k across one data block.

When data symbols are binary, taking values of +1 and −1, each symbolinformation is quantified by the LLR of a transmitted “+1” and atransmitted “−1”, given the input information, i.e.,

$\begin{matrix}{{{\lambda_{1}\left( {d_{k}(i)} \right)} = {\ln\;\frac{P\left( {{{d_{k}(i)} = \left. {+ 1} \middle| {y(i)} \right.},{\lambda_{2}^{e}(i)}} \right)}{P\left( {{{d_{k}(i)} = \left. {- 1} \middle| {y(i)} \right.},{\lambda_{2}^{e}(i)}} \right)}}},} & (2) \\{and} & \; \\{{\lambda_{2}\left( {d_{k}(i)} \right)} = {\ln\;{\frac{P\left( {{{d_{k}(i)} = \left. {+ 1} \middle| {\lambda_{1}^{e}(k)} \right.},{decoding}} \right)}{P\left( {{{d_{k}(i)} = \left. {- 1} \middle| {\lambda_{1}^{e}(k)} \right.},{decoding}} \right)}.}}} & (3)\end{matrix}$

The L-values in (3) can be obtained, for example, by following a turbodecoding algorithm. The decoder may include, for example, a set ofparallel channel decoders, one decoder for each channel, when forwarderror correction encoded for each channel is performed independently.Alternately, the decoder may include a number of channel decodersoperating across multiple channels each. Various suitable decoders foruse in this receiver will occur to one of skill in the art havingpossession of this disclosure. The data need not be limited to binary,as will be discussed further below.

Various implementations of a detector 402 will now be discussed infurther detail. For example, FIG. 5 illustrates a detector in accordancewith one embodiment of the present invention. The detector 402, mayinclude a receiver 502, a Gibbs sampler circuit 504, and a channel dataprobability estimator 506. The receiver is configured to receive amulti-channel signal and output an observed signal 510. For example, thereceiver may include a matched filter, such as a despreader orcorrelator for CDMA signals. The Gibbs sampler circuit is coupled to thereceiver and accepts the observed signal. The Gibbs sampler isconfigured to generate stochastically selected channel data hypothesis.For example, channel data hypotheses may be selected based in part onthe observed signal and a channel model using a MCMC simulator asdiscussed above. The channel data probability estimator 506 is coupledto the receiver 502 and the Gibbs sampler circuit 504, and is configuredto estimate channel data probabilities 514 from the observed signal 510and the stochastically selected channel data hypotheses 512. Forexample, in accordance with another embodiment of the present inventionthe channel data probability estimator may form the estimates of channeldata probabilities by performing a scaled summation of conditionalchannel data probabilities, where the conditional channel dataprobabilities are conditioned on the observed signal and channel datasamples and the extrinsic information provided by the decoder is takeninto account as discussed above. Scaling may be performed by taking intoaccount the auxiliary distribution selected.

In accordance with another embodiment of the present invention, thedetector may also include a log likelihood estimator 508 coupled to thechannel data probability estimator 506 and configured to estimatechannel data log likelihood estimates 516 from the channel dataprobabilities 514.

As discussed above, the detector 402 provides several advantages overdetectors disclosed in the prior art. In particular, the detector mayprovide more rapid convergence in the receiver than previously knowniterative techniques. Additionally, it may be possible to provide amulti-channel MIMO system wherein the number of transmit antennas isgreater than the number of receive antennas (a condition often referredto as “overloaded” due to the limited ability of prior art detectors toaccommodate such a situation), or a multi-channel CDMA system whereinthe number of users is greater than the spreading gain of the system.

Additional detail on certain detailed embodiments of the detector 402will now be provided. In developing a circuit implementation of thedetector, it is helpful to perform the processing in a log domain. Byoperating in the log domain, numerical stability is improved and lowerprecision may be used. This allows for smaller chip-size and very fastclock rates. The log-domain formulation also lends itself to a hardwarearchitecture that involves primarily addition, subtraction, and compareoperations. Moreover, pipelining can be introduced in the proposedarchitecture in straightforward manner. Although pipelining induces somedelay in the Gibbs sampler loop, the impact of this delay on thebehavior of the MCMC simulator has been found to be negligible.

Considering the general case where all the involved variables/constantsin the channel model of (1) is written more explicitly asy _(c) =A _(c) d _(c) +n _(c)   (1a)where the subscript ‘c’ is to emphasize the variables arecomplex-valued. Here, for convenience of formulations that follow, it isassumed that A_(c) has N/2 rows and K/2 columns. This implies that, inthe case of a MIMO system, there are K/2 transmit and N/2 receiveantennas and, in the case of a CDMA system, there are K/2 channels withspreading codes of length N/2. Accordingly, d_(c) has a length of K/2and y_(c) and n_(c) are of length N/2.

The realization of the detection algorithms is simplified by rearranging(1a) in terms of real and imaginary parts of the underlying variablesand coefficients. Such rearrangement is straightforward and leads to theequationy=Ad+n   (2a)where

$\begin{matrix}{{y = \begin{bmatrix}{\Re\left\{ y_{c,1} \right\}} \\{{??}\left\{ y_{c,1} \right\}} \\\vdots \\{\Re\left\{ y_{c,{N/2}} \right\}} \\{{??}\left\{ y_{c,{N/2}} \right\}}\end{bmatrix}},} \\{{A = \begin{bmatrix}{\Re\left\{ a_{c,1,1} \right\}} & {{- {??}}\left\{ a_{c,1,1} \right\}} & \; & {\Re\left\{ a_{c,1,{K/2}} \right\}} & {{- {??}}\left\{ a_{c,1,{K/2}} \right\}} \\{{??}\left\{ a_{c,1,1} \right\}} & {\Re\left\{ a_{c,1,1} \right\}} & \cdots & {{??}\left\{ a_{c,1,{K/2}} \right\}} & {\Re\left\{ a_{c,1,{K/2}} \right\}} \\\vdots & \; & ⋰ & \; & \vdots \\{\Re\left\{ a_{c,{N/2},1} \right\}} & {{- {??}}\left\{ a_{c,{N/2},1} \right\}} & \ldots & {\Re\left\{ a_{c,{N/2},{K/2}} \right\}} & {{- {??}}\left\{ a_{c,{N/2},{K/2}} \right\}} \\{{??}\left\{ a_{c,{N/2},1} \right\}} & {\Re\left\{ a_{c,{N/2},1} \right\}} & \; & {{??}\left\{ a_{c,{N/2},{K/2}} \right\}} & {\Re\left\{ a_{c,{N/2},{K/2}} \right\}}\end{bmatrix}},} \\{{d = \begin{bmatrix}{\Re\left\{ d_{c,1} \right\}} \\{{??}\left\{ d_{c,1} \right\}} \\\vdots \\{\Re\left\{ d_{c,{K/2}} \right\}} \\{{??}\left\{ d_{c,{K/2}} \right\}}\end{bmatrix}},} \\{{n = \begin{bmatrix}{\Re\left\{ n_{c,1} \right\}} \\{{??}\left\{ n_{c,1} \right\}} \\\vdots \\{\Re\left\{ n_{c,{N/2}} \right\}} \\{{??}\left\{ n_{c,{N/2}} \right\}}\end{bmatrix}},}\end{matrix}$

{·} and ℑ{·} denote the real and imaginary parts of the arguments,a_(c,i,j) is the ij^(th) element of A_(c), and y_(c,i), d_(c,i) andn_(c,i) are the i^(th) elements of y, d and n, respectively. Note thatwhen the elements of d_(c) are from an L×L quadrature amplitudemodulated (QAM) constellation, the elements of d are from an L-ary pulseamplitude modulation (PAM) alphabet. Also note that with the sizesspecified above, y and n will be vectors of length N, d is a vector oflength K, and A will be an N-by-K matrix. Many alternative definitionsof (2a) are also possible, including for example defining

$y = \begin{bmatrix}{\Re\left\{ y \right\}} \\{{??}\left\{ y \right\}}\end{bmatrix}$and selecting A, d, and n accordingly.

The disclosed inventive techniques can also accommodate situations wherethe transmitted symbols are non-binary (e.g. QPSK modulation). This canbe understood by the following derivation. First, define b as aninformation bit, and P(b=+1) and P(b=−1) as the probabilities of b beingequal to +1 and −1, respectively. If L=2^(M), each element of drepresents a set of M coded bits. Accordingly, d may be obtained througha mapping of a set of KM information bits b₁, b₂, . . . , b_(KM). Notethat the LLR values associated with these bits, can be obtained from theconditional probability P(b_(k)=+1|y,λ₂ ^(e)), for k=1,2, . . . , KM asdiscussed above. Here, λ₂ ^(e) is the set of a priori (extrinsic)information (the probabilities or LLR values) associated withinformation bits b₁, b₂, . . . , b_(KM) from the channel decoder. Definethe vectors b=[b₁ b₂, . . . , b_(KM)]^(T) and b_(−k)=[b₁, . . . ,b_(k−1)b_(k+1), . . . , b_(KM)]^(T), and note that

$\begin{matrix}\begin{matrix}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},\lambda_{2}^{e}} \right)} = {\sum\limits_{b_{- k}}{P\left( {{b_{k} = {+ 1}},\left. b_{- k} \middle| y \right.,\lambda_{2}^{e}} \right)}}} \\{{= {\sum\limits_{b_{- k}}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k},\lambda_{2}^{e}} \right)}\;{P\left( {\left. b_{- k} \middle| y \right.,\lambda_{2}^{e}} \right)}}}},}\end{matrix} & \left( {4a} \right)\end{matrix}$where the second identity follows by applying the chain rule, and thesummation is over all possible values of b_(−k). As discussed above, thenumber of combinations that b_(−k) takes is equal to 2^(Km−1), and hencethe Gibbs sampler is used as discussed above to select importantsamples. In accordance with one embodiment of the present invention, theGibbs sampler may examine the successive elements of b and assign themrandom values such that the choices of b that result in small differencey−Ad are visited more frequently as discussed generally above.

In accordance with one embodiment of the present invention, to drawb_(k) ^((n)) in the Gibbs sampler, the distribution P(b_(k)|b₁ ^((n)), .. . , b_(k−1) ^((n)), b_(k+1) ^((n−1)), . . . , b_(KM) ^((n−1)),y,λ₂^(e)) is obtained, i.e., P(b_(k)=+1|b₁ ^((n)), . . . , b_(k−1) ^((n)),b_(k+1) ^((n−1)), . . , b_(KM) ^((n−1)),y,λ₂ ^(e)) and P(b_(k)=−1|b₁^((n)), . . . , b_(k−1) ^((n)), _(k−1) ^((n−1)), . . . , b_(KM)^((n−1)),y,λ₂ ^(e)). To derive equations for these probabilities, defineb _(−k) ^((n)) Δ[b ₁ ^((n)) , . . . , b _(k−1) ^((n)) b _(k+1) ^((n−1)), . . . , b _(KM) ^((n−1))]^(T)   (5a)and note that

$\begin{matrix}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)} = \frac{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)} + {P\left( {{b_{k} = \left. {- 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)}}} & \left( {6a} \right)\end{matrix}$since the summation in the denominator is equal to one. In (6a) andother equations that follow, b_(−k) ^((n)) is the short hand notationfor b_(−k)=b_(−k) ^((n)). Note also that

$\begin{matrix}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)} = \frac{{p\left( {\left. y \middle| b_{k^{+}}^{(n)} \right.,\lambda_{2}^{e}} \right)}\;{P\left( b_{k^{+}}^{(n)} \middle| \lambda_{2}^{e} \right)}}{p\left( {y,b_{- k}^{(n)},\lambda_{2}^{e}} \right)}} & \left( {7a} \right)\end{matrix}$where b_(k) ₊ ^((n))=[b₁ ^((n)) . . . b_(k−1) ^((n)),+1,b_(k+1) ^((n−1)). . . b_(KM) ^((n−1))]^(T). Also defined b_(k) ₊ =[b₁ ^((n)) . . .b_(k−1) ^((n)),−1, b_(k+1) ^((n−1)) . . . b_(KM) ^((n−1))]^(T). Notingthat p(y|b_(k) ₊ ^((n)),λ₂ ^(e))=p(y|b_(k) ₊ ^((n)), since when the datavector b is fully specified the extrinsic information become irrelevant,and substituting (7a) in (6a), after deleting the common terms,

$\begin{matrix}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)} = \frac{{p\left( y \middle| b_{k^{+}}^{(n)} \right)}\;{P^{e}\left( {b_{k} = {+ 1}} \right)}}{{{p\left( y \middle| b_{k^{+}}^{(n)} \right)}\;{P^{e}\left( {b_{k} = {+ 1}} \right)}} + {{p\left( y \middle| b_{k^{-}}^{(n)} \right)}\;{P^{e}\left( {b_{k} = {- 1}} \right)}}}} & \left( {8a} \right)\end{matrix}$where P^(e)(b_(k)=i)ΔP(b_(k)=i|λ₂ ^(e)(b_(k))), for i=±1, and theelements of b can be assumed independent of each other, hence,P(b _(k) ₊ ^((n))|λ₂ ^(e))=P ^(e)(b ₁ =b ₁ ^((n))) . . . P ^(e)(b _(k−1)=b _(k−1) ^((n)))P ^(e)(b _(k)=+1)P ^(e)(b _(k+1) =b _(k+1) ^((n−1))) .. . P ^(e)(b _(KM) =b _(KM) ^((n−1))).   (9a)

When A is known (or has been estimated) and the noise vector n isGaussian and satisfies

$\begin{matrix}{{{E\left\lbrack {n\; n^{H}} \right\rbrack} = {\frac{1}{2}\;\sigma_{n}^{2}I}},} & \; \\{{{p\left( y \middle| b_{k^{+}}^{(n)} \right)} = {\frac{1}{\left( {\pi\;\sigma_{n}^{2}} \right)^{N/2}}\;{\mathbb{e}}^{{- {{y - {A\; d_{k^{+}}^{(n)}}}}^{2}}/\sigma_{n}^{2}}}},} & \left( {10a} \right)\end{matrix}$where d_(k) ₊ ^((n)) is the vector of transmit symbols obtained throughmapping from b_(k) ₊ ^((n)). Although direct substitution of (10a) and asimilar equation with b_(k) ₊ ^((n)) replaced by b_(k) ⁻ ^((n)) in (8a)can be done, the result is computationally involved and may be sensitiveto numerical errors procedure. Hence, a log-domain implementation ispreferable as will now be described.

Defining

$\begin{matrix}{\gamma_{k^{+}}\overset{\Delta}{=}{{\ln\left( {P^{e}\left( {b_{k} = {+ 1}} \right)} \right)} - \frac{{{y - {A\; d_{k^{+}}^{(n)}}}}^{2}}{\sigma_{n}^{2}}}} & \left( {11a} \right) \\{and} & \; \\{\gamma_{k^{-}}\overset{\Delta}{=}{{\ln\left( {P^{e}\left( {b_{k} = {- 1}} \right)} \right)} - \frac{{{y - {A\; d_{k^{-}}^{(n)}}}}^{2}}{\sigma_{n}^{2}}}} & \left( {12a} \right)\end{matrix}$(8a) may be rearranged as

$\begin{matrix}{{P\left( {{b_{k} = \left. {+ 1} \middle| y \right.},b_{- k}^{(n)},\lambda_{2}^{e}} \right)} = {\frac{1}{1 + e^{- {({\gamma_{k^{+}} - \gamma_{k^{-}}})}}}.}} & \left( {13a} \right)\end{matrix}$Using (11a) and (12a), to obtain

$\begin{matrix}{{\gamma_{k^{+}} - \gamma_{k^{-}}} = {{\lambda_{2}^{e}\left( b_{k} \right)} - \frac{{{y - {A\; d_{k^{+}}^{(n)}}}}^{2} - {{y - {A\; d_{k^{-}}^{(n)}}}}^{2}}{\sigma_{n}^{2}}}} & \left( {14a} \right)\end{matrix}$where

${\lambda_{2}^{e}\left( b_{k} \right)} = {\ln\;{\frac{P^{e}\left( {b_{k} = {+ 1}} \right)}{P^{e}\left( {b_{k} = {- 1}} \right)}.}}$Noting that d_(k) ₊ ^((n)) and d_(k) ⁻ ^((n)) differ in one term,straightforward manipulations lead to

$\begin{matrix}{{\gamma_{k^{+}} - \gamma_{k^{-}}} = {{\lambda_{2}^{e}\left( b_{k} \right)} - {\frac{1}{\sigma_{n}^{2}}\left( {\left( d_{k^{+},p}^{(n)} \right)^{2} - \left( d_{k^{-},p}^{(n)} \right)^{2}} \right)\mspace{11mu} r_{pp}} + {\frac{2}{\sigma_{n}^{2}}\left( {d_{k^{+},p}^{(n)} - d_{k^{-},p}^{(n)}} \right)\left( {y_{p}^{mf} - {\sum\limits_{{q = 1},{q \neq p}}^{K}{r_{pq}d_{q}^{(n)}}}} \right)}}} & \left( {15a} \right)\end{matrix}$where y_(p) ^(mf)=a_(p) ^(T)y, r_(pq) is the pq th element of R=A^(T)A,d_(q) ^((n)) is the qth element of d^((n)), d_(k) ₊ _(,p) ^((n)) andd_(k) ⁻ _(,p) ^((n)) are the p th element of d_(k) ₊ ^((n)) and d_(k) ⁻^((n)), respectively, and b_(k) is mapped to d_(p) ^((n)). Note thatd_(q) ^((n)), depends on b_(k) when q=p, and the above notation reflectsthis fact. Equation (15a) may be further rearranged as

$\begin{matrix}\begin{matrix}{{\gamma_{k^{+}} - \gamma_{k^{-}}} = {{\lambda_{2}^{e}\left( b_{k} \right)} - {\kappa_{1}\; r_{pp}^{\prime}} + {\kappa_{2}\left( {y_{p}^{\prime\;{mf}} - {\sum\limits_{{q = 1},{q \neq p}}^{K}{r_{pq}^{\prime}d_{q}^{(n)}}}} \right)}}} \\{where} \\{{\kappa_{1} = {\frac{1}{2}\left( {\left( d_{k^{+},p}^{(n)} \right)^{2} - \left( d_{k^{-},p}^{(n)} \right)^{2}} \right)}},{\kappa_{2} = \left( {d_{k^{+},p}^{(n)} - d_{k^{-},p}^{(n)}} \right)},{y_{p}^{\prime\;{mf}} = {\frac{2}{\sigma_{n}^{2}}y_{p}^{mf}}}} \\{{and}\mspace{14mu}{r_{pq}^{\prime} = {\frac{2}{\sigma_{n}^{2}}{r_{pq}.}}}}\end{matrix} & \left( {16a} \right)\end{matrix}$Hence, y′_(p) ^(mf) and r′_(pq) can be pre-calculated prior to startingthe Gibbs sampler and thus be provided as inputs to it.

When transmitted symbols are QPSK, d=b, d_(k) ₊ _(,p) ^((n))=+1 andd_(k) ⁻ _(,p) ^((n))=−1, and accordingly (16a) reduces to

$\begin{matrix}{{\gamma_{k^{+}} - \gamma_{k^{-}}} = {{\lambda_{2}^{e}\left( b_{k} \right)} + {2\;{\left( {y_{p}^{\prime\;{mf}} - {\sum\limits_{{q = 1},{q \neq p}}^{K}{r_{pq}^{\prime}b_{q}^{(n)}}}} \right).}}}} & \left( {17a} \right)\end{matrix}$Moreover, since b_(q) ^((n))'s are binary, evaluation of the right-handside of (17a) involves primarily additions and subtractions.

In the case of more complex modulations, such as QAM, evaluation of(16a) is naturally more involved. However, since d_(q) ^((n))'s are froma relatively small alphabet, an add/subtract implementation is stillfeasible as will be apparent to one of skill in the art havingpossession of this disclosure.

Evaluation of (13a) involves a function of the form 1/(1e^(x)). Variousapproaches for implementing this function in hardware are possible,including using a lookup table or using the linear approximation:

$\begin{matrix}{\frac{1}{1 + e^{- x}} \approx {\frac{x}{2^{3}} + \frac{1}{2}}} & \left( {18a} \right)\end{matrix}$When this approximation is used, it is possible that the evaluated valueof P(b_(k)=+1|y,b_(−k) ^((n)),λ₂ ^(e)) exceeds one or becomes negative.When this happens, P(d_(k)=+1|y,b_(−k) ^((n)),λ₂ ^(e)) may be hardlimited to the maximum and minimum values of 1 and 0, respectively.

Using (25) above,

$\begin{matrix}{{\lambda_{1}^{e}\left( b_{k} \right)} = {\ln\;\frac{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| b_{- k}^{(n)} \right.,{b_{k} = {+ 1}}} \right)}\;{P^{e}\left( b_{- k}^{(n)} \right)}}}{\sum\limits_{n = 1}^{N_{s}}{{p\left( {\left. y \middle| b_{- k}^{(n)} \right.,{d_{k} = {- 1}}} \right)}\;{P^{e}\left( b_{- k}^{(n)} \right)}}}}} & \left( {19a} \right)\end{matrix}$whereP ^(e)(b _(−k) ^((n)))=P ^(e)(b ₁ =b ₁ ^((n))) . . . P ^(e)(b _(k−1) =b_(k−1) ^((n)))P ^(e)(b _(k+1) =b _(k+1) ^((n−1))) . . . P ^(e)(b _(KM)=b _(KM) ^((n−1))).   (20a)To ensure that the samples b^((n)) are different, repetitions of b^((n))may be deleted while running the Gibbs sampler. An efficientimplementation can be developed by defining

$\begin{matrix}{\eta_{k^{+}}^{(n)}\overset{\Delta}{=}{{\ln\mspace{11mu} P^{e}\;\left( b_{- k}^{(n)} \right)} - \frac{{{y - {A\; d_{k^{+}}^{(n)}}}}^{2}}{\sigma_{n}^{2}}}} & \left( {21a} \right) \\{and} & \; \\{\eta_{k^{-}}^{(n)}\overset{\Delta}{=}{{\ln\mspace{11mu} P^{e}\;\left( b_{- k}^{(n)} \right)} - \frac{{{y - {A\; d_{k^{-}}^{(n)}}}}^{2}}{\sigma_{n}^{2}}}} & \left( {22a} \right)\end{matrix}$and noting that (19a) can be rearranged as

$\begin{matrix}{{\lambda_{1}^{e}\left( b_{k} \right)} = {{\ln\left( {\sum\limits_{n = 1}^{N_{s}}{\mathbb{e}}^{\eta_{k^{+}}^{(n)}}} \right)} - {{\ln\left( {\sum\limits_{n = 1}^{N_{s}}{\mathbb{e}}^{\eta_{k^{-}}^{(n)}}} \right)}.}}} & \left( {23a} \right)\end{matrix}$

Additional simplification can be obtained using the definition

$\begin{matrix}\begin{matrix}{{{\max^{*}\left( {x,y} \right)} = {{\max\left( {x,y} \right)} + {\ln\left( {1 + e^{- {{x - y}}}} \right)}}},{where}} \\{{{\max^{*}\left( {x,y} \right)}\overset{\Delta}{=}{\ln\left( {e^{x} + e^{y}} \right)}},}\end{matrix} & \left( {24a} \right)\end{matrix}$and its recursive extension

$\begin{matrix}\begin{matrix}{{{\max^{*}\left( {x,y,z} \right)} = {\max^{*}\left\lbrack {{\max^{*}\left( {x,y} \right)},z} \right\rbrack}},{where}} \\{{\max^{*}\left( {x,y,z} \right)}\overset{\Delta}{=}{{\ln\left( {e^{x} + e^{y} + e^{z}} \right)}.}}\end{matrix} & \left( {25a} \right)\end{matrix}$

Computation of ln(1+e^(−|x−y|)) may be done through the use of a smalllook-up table. Alternately, the approximation

$\begin{matrix}{{\lambda_{1}^{e}\left( b_{k} \right)} \approx {{\max\limits_{n}\;\eta_{k^{+}}^{(n)}} - {\max\limits_{n}\;{\eta_{k^{-}}^{(n)}.}}}} & \left( {26a} \right)\end{matrix}$can be used. This approximation significantly simplifies theimplementation of the detector. In (23a), each new sample of b is to becompared with the previous elements and deleted if it is a repeatsample. When N_(s) is large this may be a complexity burden.Implementation of (26a), in contrast, is relatively simple. As samples bare generated, the respective values η_(k) ₊ ^((n)) and η_(k) ⁻ ^((n))are computed. Each of these are then compared their counterpart from theprevious Gibbs samples, i.e., with η_(k+) ^((n−1)) and η_(k) ⁻^((n−1)),respectively, and replaced if they are larger or discarded ifthey are smaller. This also avoids the need to search for repetitions ofb; a significant savings since the number of comparisons needed fordeleting repetitions grows with the square of N_(s).

An intermediate solution that also avoids the complexity burden of(23a), while still performing better than (26a) is to keep a handfullargest samples of η_(k) ₊ ^((n)) and η_(k) ⁻ ^((n)) for application inan equation similar to (23a). Implementation of these results in circuitform will now be presented.

FIG. 6 illustrates a Gibbs sampler, in accordance with an embodiment ofthe present invention. The Gibbs sampler includes front-end processing602. The front-end processing takes y, A and σ_(n) ² as inputs andcalculates y′^(mf) and

$R^{\prime} = {\frac{2}{\sigma_{n}^{2}}{R.}}$The channel response, A and noise variance σ_(n) ², may be known or maybe estimated by the receiver using techniques known to one of skill inthe art.

Substituting (16a) in (18a),

$\begin{matrix}{{P\left( {{b_{k} = {{+ 1}❘y}},b_{- k}^{(n)},\lambda_{2}^{e}} \right)} \approx {\frac{{\lambda_{2}^{e}\left( b_{k} \right)} - {\kappa_{1}r_{pp}^{\prime}} + {\kappa_{2}\left( {y_{p}^{\prime{mf}} - {\sum\limits_{{q = 1},{q \neq p}}^{K}{r_{pq}^{\prime}d_{q}^{(n)}}}} \right)}}{2^{3}} + {\frac{1}{2}.}}} & \left( {27a} \right)\end{matrix}$In the Gibbs sampler 504, the next choice of b_(k) is made by selectingb_(k)=+1 with the probability P(b_(k)=+1|y,b_(−k) ^((n)),λ₂ ^(e)). Torealize this in hardware, a random variable, v, with uniformdistribution in the interval 0 to 1 may be generated, and b_(k) is setequal to +1 if P(b_(k)=+1|y,b_(−k) ^((n)),λ₂ ^(e))−v>0, otherwise b_(k)is set equal to −1. Alternatively, and more conveniently here, thelatter inequality can be written as 2³(P(b_(k)=+1|y,b_(−k) ^((n)),λ₂^(e))−0.5)−u>0, where u=2³(v−0.5) is a random with uniform distributionin the interval −4 to +4. The random variable u can be easily generatedusing a linear feedback shift register (LFSR) circuit 628. Various LFSRcircuits are known in the art.

The R′ register 604 stores a matrix of elements r′_(pq). The diagonalelements of R′ are forced to zero in order to exclude the case q=p asmotivated by (27a). Since in practice d_(i)'s are from a small alphabetand accordingly the number of possible values of κ₁ and κ₂ are small,each multiplier 606, 612, 616 can optionally be implemented byadd/subtract operations. Hence, the Gibbs sampler can be implemented ina multiplier-free realization. The rest of the operation of the Gibbssampler thus follows from (27a).

Computation of L-values is done according to (21a), (22a) and (26a) byapplying various approximations. Although, here, to simplify thediscussion, use of (26a) is emphasized, the developed circuit, withminor modification as will occur to one of skill in the art, is alsoapplicable to (23a).

From (21a) and (22a), note that the computation of η_(k+) ^((n)) andη_(k−) ^((n)) requires computation of ln P^(e)(b_(−k) ^((n))) and

$\frac{{{y - {Ad}^{(n)}}}^{2}}{\sigma_{n}^{2}}.$Continuing with the description of procedures for computation of theseterms, note that

${\lambda_{2}^{e}\left( b_{k} \right)} = {{\ln\;\frac{P^{e}\left( {b_{k} = {+ 1}} \right)}{P^{e}\left( {b_{k} = {- 1}} \right)}} = {\ln\;\frac{P^{e}\left( {b_{k} = {+ 1}} \right)}{1 - {P^{e}\left( {b_{k} = {+ 1}} \right)}}}}$and using this obtain, for i=±1,

$\begin{matrix}\begin{matrix}{{\ln\;{P^{e}\left( {b_{k} = i} \right)}} = {\ln\;\frac{1}{1 + {\mathbb{e}}^{- {{\mathbb{i}\lambda}_{2}^{e}{(b_{k})}}}}}} \\{= {- {\max^{*}\left( {0,{- {{\mathbb{i}\lambda}_{2}^{e}\left( b_{k} \right)}}} \right)}}} \\{\approx {- {\max\left( {0,{- {{\mathbb{i}\lambda}_{2}^{e}\left( b_{k} \right)}}} \right)}}} \\{= {{\min\left( {0,{{\mathbb{i}\lambda}_{2}^{e}\left( b_{k} \right)}} \right)}.}}\end{matrix} & \left( {28a} \right)\end{matrix}$Using (28a) and ψ_(k) ^((n)) as an approximation to ln P^(e)(b_(k)^((n)), where b_(k) ^((n))=[b₁ ^((n) . . . b) _(k) ^((n))b_(k+1)^((n−1)) . . . b_(KM) ^((n−1))]^(T), and noting that because of theinterleaving effect the information bits are independent,

$\begin{matrix}{\psi_{k}^{(n)} = {{\sum\limits_{l = 1}^{k}{\min\left( {0,{b_{l}^{(n)}{\lambda_{2}^{e}\left( b_{k} \right)}}} \right)}} + {\sum\limits_{l = {k + 1}}^{KM}{{\min\left( {0,{b_{l}^{({n - 1})}{\lambda_{2}^{e}\left( b_{k} \right)}}} \right)}.}}}} & \left( {29a} \right)\end{matrix}$Using (29a), ψ_(k) ^((n)) can be updated according to the followingrule:

-   -   if b_(k) ^((n))≠b_(k) ^((n−n)) then ψ_(k) ^((n))=ψ_(k)        ^((n−1))−b_(k) ^((n−1))λ₂ ^(e)(b_(k)) else ψ_(k) ^((n))=ψ_(k)        ^((n−1)).        Note that ψ_(k) ^((n))=ψ_(k) ^((n−1))−min(0,b_(k) ^(n−1))λ₂        ^(e)(b_(k)))+min(0,b_(k) ^((n))λ₂ ^(e)(b_(k))), which can be        simplified to        ψ_(k) ^((n))=ψ_(k) ^((n−1)) −b _(k) ^((n−1))λ₂ ^(e)(b _(k))          (29b)        when b_(k) ^((n))≠b_(k) ^((n−1)). Once ψ_(k) ^((n)) is        available, ln P^(e)(b_(−k) ^((n)) can be calculated by removing        the contribution of b_(k) ^((n)), viz.,        ln P ^(e)(b _(−k) ^((n)))=ψ_(k) ^((n))+max(0,b _(k) ^((n))λ₂        ^(e)(b _(k))).   (30a)

FIG. 7 illustrates a ψ-calculator circuit 700 based on the above result,in accordance with an embodiment of the present invention. When b_(k)^((n))≠b_(k) ^((n−n)), the ψ register 702 is enabled and its content isupdated according to (29b) using summer 704 for calculate the new value.Following (30a), ln P^(e)(b_(−k) ^((n))) is determined by the presentcontent of ψ register, when b_(k) ^((n−1))λ₂ ^(e)(b_(k)) is positive, orby the updated ψ, when b_(k) ^((n−1))λ₂ ^(e)(b_(k)) is negative byselecting the appropriate resulting using a multiplexer 706 (oralternately, multiplexer 706 may be implemented by a switch) controlledby a sign bit extractor 708. Note that ψ is insensitive to a bias sincethe subtraction of the two terms on the right-hand side of (26a) (andalso (23a)) removes such a bias. Hence, ψ can be assigned essentiallyany arbitrary initial value.

Next, a circuit for computation of

$\frac{{{y - {Ad}_{k^{+}}^{(n)}}}^{2}}{\sigma_{n}^{2}}\mspace{14mu}{and}\mspace{14mu}\frac{{{y - {Ad}_{k^{-}}^{(n)}}}^{2}}{\sigma_{n}^{2}}$is described. Let

$\xi_{k^{+}}^{(n)} = {{\frac{{{y - {Ad}_{k^{+}}^{(n)}}}^{2}}{\sigma_{n}^{2}}\mspace{14mu}{and}\mspace{14mu}\xi_{k^{-}}^{(n)}} = \frac{{{y - {Ad}_{k^{-}}^{(n)}}}^{2}}{\sigma_{n}^{2}}}$and note that the difference ξ_(k) ₊ ^((n))−ξ_(k) ⁻ ^((n))=δ isavailable from the Gibbs sampler. Accordingly, updating ξ_(k) ₊ ^((n))and ξ_(k) ⁻ ^((n)) may be performed according to the rule:

if  b_(k)^((n)) ≠ b_(k)^((n − 1)) if  b_(k)^((n)) = +1ξ_(k⁺)^((n)) = ξ + δ; ξ_(k⁻)^((n)) = ξ; ξ = ξ + δ elseξ_(k⁺)^((n)) = ξ; ξ_(k⁻)^((n)) = ξ − δ; ξ = ξ − δ elseif  b_(k)^((n)) = +1 ξ_(k⁺)^((n)) = ξ; ξ_(k⁻)^((n)) = ξ − δ elseξ_(k⁺)^((n)) = ξ + δ; ξ_(k⁻)^((n)) = ξ

Here, ξ is an auxiliary variable that helps with the implementation ofthe procedure. FIG. 8 illustrates ξ-calculator circuit to calculate ξ,in accordance with an embodiment of the present invention. The contentof the ξ register 802 is updated whenever b_(k) ^((n))≠b_(k) ^((n−1)).It is updated to the value of ξ+δ (computed by adder 804) when b_(k)^((n))=+1 and to the value of ξ−δ when b_(k) ^((n))−1.

Finally, a log-likelihood calculator circuit for computing the LLRvalues is shown in FIG. 9 in accordance with an embodiment of thepresent invention. The log-likelihood calculator 900 uses theξ-calculator circuit 800 and ψ-calculator circuit 700. Thelog-likelihood calculator implements (26a).

As shown above, one step of the Gibbs sampler uses one clock cycle.Hence one cycle of the Gibbs samplers use KMl clock cycles to execute.In previously known MCMC simulation, the content of b is considered as asample of the Markov chain at the end of each iteration of the Gibbssampler. In contrast, here the transitional samples b_(k) ^((n)=[b) ₁^((n)) . . . b_(k) ^((n) b) _(k+1) ^((n−1)) . . . b_(KM) ^((n−1))]^(T)are used. This modification reduces the hardware complexity andincreases the speed of operation of the circuit. This modificationappears to result in minor impact on the receiver performance.

As will occur to one of skill in the art, each of the above illustratedcircuit may include pipelining to increase throughput. For example, abuffer can be placed after each adder. Similiary, the multipliers canalso be implemented using a combination of adders and substractors, withpipelining. Accordingly, the clock rate of the circuit can be increasedto a rate where one addition is performed each clock period. Of course,it will be recognized that additional delays may be used in order totime align various parameters. For example, the b values may be delayedin order to ensure they time-correspond to values of δ, which havelonger pipeline delays in their computation.

A pipelined implementation will introduce a delay in the Gibbs samplercircuit, resulting in the Gibbs sampler drawing samples b_(k) ^((n))from the distribution P(b_(k|b) _(1,) ^((n)), . . . , b_(k−Δ−1) ^((n)),. . . , b_(k−Δ) ^((n−1)),b_(k+1) ^((n−1)), . . . , b_(KM) ^((n−1)),y,λ₂^(e)) rather than the initially discussed distribution P(b_(k|b) ₁^((n)), . . . , b_(k−1) ^((n)), . . . , b_(KM) ^((n−1)),y,λ₂ ^(e)). Thisdoes not appear to have an appreciate impact of the performance of thedetector.

In summary, efficient hardware implementations of the detector usingadd, subtract and compare operations are illustrated. Suchimplementations are amenable to pipelining to increase the clock speedof operation. Additional benefits of the implementation are relativelysmall word lengths (e.g. 8-12 bit precision). Although the above hasbeen described with particular reference to digital components, e.g. asimplemented in an application specific integrated circuit or fieldprogrammable gate array, it is to be understood that the circuits canequally be implemented using a general purpose processor or digitalsignal processor.

Finally, it is noted that the mathematical derivation of operation ofthe presently disclosed inventive techniques made certain assumptions,e.g. Gaussian white noise and independent data. These assumptions arenot essential to the operation of the disclosed embodiments, asexcellent performance can be obtained in many situations which do notexactly fit these mathematical assumptions.

It is to be understood that the above-referenced arrangements are onlyillustrative of the application for the principles of the presentinvention. Numerous modifications and alternative arrangements can bedevised without departing from the spirit and scope of the presentinvention. While the present invention has been shown in the drawingsand fully described above with particularity and detail in connectionwith what is presently deemed to be the most practical and preferredembodiment(s) of the invention, it will be apparent to those of ordinaryskill in the art that numerous modifications can be made withoutdeparting from the principles and concepts of the invention as set forthherein.

1. A method for estimating channel data bit probabilities of amulti-channel signal received through a communication channelcomprising: selecting a subset of hypothetical channel data patternsusing a Markov chain Monte Carlo simulation to stochastically select aplurality of hypothetical channel data patterns corresponding todominant conditional bit probability terms, wherein the conditional bitprobability terms are conditioned on an observation of the multi-channelsignal and the hypothetical channel data patterns; and summing thedominant conditional bit probability terms to obtain a channel data bitprobability summation.
 2. The method of claim 1 further comprisingsetting transition probabilities of the Markov chain Monte Carlosimulation to correspond to conditional channel data bit probabilitiesbased on a model of the communication channel.
 3. The method of claim 1further comprising setting transition probabilities of the Markov chainMonte Carlo simulation to correspond to an auxiliary probabilitydistribution.
 4. The method of claim 3 further comprising scaling theconditional bit probability based on the auxiliary probabilitydistribution.
 5. The method of claim 3 wherein the auxiliarydistribution is a uniform distribution.
 6. The method of claim 3 whereinthe auxiliary distribution is a based on a model of the communicationchannel at a signal to noise ratio below an estimated operating signalto noise ratio.
 7. The method of claim 1 further comprising forming alog likelihood ratio estimate from the channel data bit probabilitysummation.
 8. The method of claim 7 further comprising derivingextrinsic information from the log likelihood ratio estimate.
 9. Themethod of claim 8 further comprising providing the extrinsic informationto a forward error correction decoder.
 10. The method of claim 1 furthercomprising incorporating extrinsic information from a forward errorcorrection decoder into the channel data bit probability summation. 11.The method of claim 1 wherein the communication channel is chosen fromthe group consisting of a code division multiple access channel and amultiple input multiple output channel.
 12. The method of claim 1further comprising running the Markov chain Monte Carlo simulation for aburn-in period prior to selecting the subset of hypothetical channeldata patterns.
 13. The method of claim 1 further comprising initiating aplurality of parallel Markov chain Monte Carlo simulations tostochastically select a plurality of hypothetical channel data patternscorresponding to dominant conditional bit probability terms in thechannel data bit probability summation.
 14. A detector for estimatingchannel data probabilities in a multi-channel communication systemcomprising: a receiver configured to receive a multi-channel signal andoutput an observed signal; a Gibbs sampler circuit coupled to thereceiver and configured to generate stochastically selected channel datahypotheses based in part on the observed signal and a channel model; anda channel data probability estimator coupled to the receiver and coupledto the Gibbs sampler circuit and configured to estimate channel dataprobabilities from conditional data probabilities conditioned on theobserved signal and the stochastically selected channel data hypotheseswherein, the Gibbs sampler circuit further comprises a Markov chainMonte Carlo simulator coupled to the receiver and configured to generatechannel data samples from an auxiliary distribution wherein transitionprobabilities are based in part on the observed signal, and wherein theauxiliary distribution is based on the channel model using a signal tonoise ratio below an estimated operating signal to noise ratio of themulti-channel communication system.
 15. The detector of claim 14 whereinthe channel data probability estimator further comprises a loglikelihood ratio estimator coupled to the channel data probabilityestimator and configured to estimate channel data log likelihood ratiosfrom the channel data probabilities.
 16. The detector of claim 14further comprising a decoder coupled to the channel data probabilityestimator and configured to exchange extrinsic information with thechannel data probability estimator.
 17. A device for estimating channeldata bit probabilities of a multi-channel signal received through acommunication channel comprising: means for selecting a subset ofhypothetical channel data patterns using a Markov chain Monte Carlosimulation to stochastically select a plurality of hypothetical channeldata patterns corresponding to dominant conditional bit probabilityterms, wherein the conditional bit probability terms are conditioned onan observation of the multi-channel signal and the hypothetical channeldata patterns; and means for summing the dominant conditional bitprobability terms to obtain a channel data bit probability summation.18. The device of claim 17 further comprising means for forming a loglikelihood ratio estimate from the channel data bit probabilitysummation.
 19. The device of claim 17 wherein the communication channelis chosen from the group consisting of a code division multiple accesschannel and a multiple input multiple output channel.
 20. The device ofclaim 17 wherein the means for selecting a subset of hypotheticalchannel data patterns comprises a plurality of parallel markov chainMonte Carlo simulations.